Strategies and Implementation for Translating OpenMP Code for Clusters

Authors

  • Deepak Eachempati
  • Lei Huang
  • Barbara M. Chapman
Abstract

OpenMP is a portable shared-memory programming interface that promises high programmer productivity for multithreaded applications. It is designed for small and medium-sized shared-memory systems. We have developed strategies to extend OpenMP to clusters via compiler translation to a Global Arrays program. In this paper, we describe our implementation of the translation in the Open64 compiler, focusing on strategies to improve the translation of sequential regions. Our work is based on the open-source Open64 compiler suite for C, C++, and Fortran 90/95.
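To make the idea of moving an OpenMP loop to a cluster more concrete, the following is a minimal sketch and not the translation the paper implements. It does not call the Global Arrays library and does not reflect code generated by Open64. The function names (block_range, sum_openmp, sum_partitioned) are hypothetical, and the "processes" are emulated by an ordinary loop. It only shows the general pattern a distributed-memory target needs: split the iteration space into blocks, let each process work on its own block, then combine the partial results.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: half-open iteration range [lo, hi) owned by `rank`
   when n iterations are block-distributed over nprocs processes.
   The first (n % nprocs) ranks receive one extra iteration. */
void block_range(size_t n, int nprocs, int rank, size_t *lo, size_t *hi)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    size_t base = n / (size_t)nprocs;
    size_t rem = n % (size_t)nprocs;
    size_t r = (size_t)rank;
    *lo = r * base + (r < rem ? r : rem);
    *hi = *lo + base + (r < rem ? 1 : 0);
}

/* Shared-memory form: the kind of loop an OpenMP programmer writes.
   Without OpenMP support enabled, the pragma is ignored and the loop
   runs sequentially with the same result. */
double sum_openmp(const double *a, size_t n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (long i = 0; i < (long)n; i++) {
        s += a[i];
    }
    return s;
}

/* Cluster-style form (assumption, for illustration only): each emulated
   process sums only the block it owns, and the partial sums are combined.
   In a real cluster program the ranks would run concurrently and the
   combination step would be a global reduction. */
double sum_partitioned(const double *a, size_t n, int nprocs)
{
    double total = 0.0;
    for (int rank = 0; rank < nprocs; rank++) {
        size_t lo, hi;
        block_range(n, nprocs, rank, &lo, &hi);
        double partial = 0.0;
        for (size_t i = lo; i < hi; i++) {
            partial += a[i];
        }
        total += partial; /* stands in for a global reduction */
    }
    return total;
}
```

Calling sum_openmp(a, n) and sum_partitioned(a, n, p) on the same array should agree up to floating-point rounding, since the second form only changes which process visits which iterations.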




Publication date: 2007